Text, Speech, and Vision for Video Segmentation: The Informedia™ Project

Authors

  • Alexander G. Hauptmann
  • Michael A. Smith
Abstract

We describe three technologies involved in creating a digital video library suitable for full-content search and retrieval. Image processing analyzes scenes, speech processing transcribes the audio signal, and natural language processing determines word relevance. The integration of these technologies enables us to include vast amounts of video data in the library.


Similar resources

Informedia™: News-on-Demand Experiments in Speech Recognition

In theory, speech recognition technology can make any spoken words in video or audio media usable for text indexing, search and retrieval. This article describes the News-on-Demand application created within the Informedia™ Digital Video Library project and discusses how speech recognition is used in transcript creation from video, alignment with closed-captioned transcripts, audio paragraph s...


Story Segmentation and Detection of Commercials in Broadcast News Video

The Informedia Digital Library Project [Wactlar96] allows full content indexing and retrieval of text, audio and video material. Segmentation is an integral process in the Informedia digital video library. The success of the Informedia project hinges on two critical assumptions: that we can extract sufficiently accurate speech recognition transcripts from the broadcast audio and that we can seg...


University of Central Florida at TRECVID 2004

This year, the Computer Vision Group at the University of Central Florida participated in two tasks in TRECVID 2004: High-Level Feature Extraction and Story Segmentation. For the feature extraction task, we have developed detection methods for “Madeleine Albright”, “Bill Clinton”, “Beach”, “Basketball Scored” and “People Walking/Running”. We used the AdaBoost technique, and have employed the speech ...


Linking visual and textual data on video

The Informedia Digital Video Library Project at Carnegie Mellon University [1] combines speech, image and natural language understanding to automatically transcribe, segment and index video for intelligent search and image retrieval. Since 1995, thousands of hours of video (over two terabytes of data) have been collected, with automatically generated metadata and indices for retrieving videos from...


Informedia: News-on-Demand Multimedia Information Acquisition

In theory, speech recognition technology can make any spoken words in video or audio media subject to text indexing, search and retrieval. This article describes the News-on-Demand application created within the Informedia™ Digital Video Library project and discusses how speech recognition is used for transcript creation from video, time alignment of closed-captioned transcripts, a speech quer...




Publication date: 1995